Retrieving Data From More Than One Table
The examples presented up to now in this chapter have illustrated the essential features of simple SELECT statements with data retrieval from single tables. However, much of the power of SQL lies in the ability to perform joins through a single statement, i.e. to select data from two or more tables, using the search condition to link the tables in a meaningful way.
The Join Condition
In retrieving data from more than one table, the search condition or join condition specifies the way the tables are to be linked. For example:
List the product name in addition to the EAN and price:
SELECT product, ean_code, price FROM items JOIN products ON items.product_id = products.product_id;
The join condition here is ITEMS.PRODUCT_ID = PRODUCTS.PRODUCT_ID, which relates the product identifier in table ITEMS (where codes are listed) to the product identifier in table PRODUCTS (where names are listed).
Returns:
PRODUCT |
|
|
100 Anos |
|
|
12 Golden Country Greats |
|
|
12 Super Exitos |
|
|
1492: Conquest of Paradise |
|
|
... |
|
|
Conceptually, the join first establishes a table containing all combinations of the rows in PRODUCTS with the rows in ITEMS, then selects those rows in which the two PRODUCT_ID values are equal. See Conceptual Description of the Selection Process for a fuller description of the conceptual SELECT process.
This does not necessarily represent the order in which the operations are actually performed; the order of evaluation of a complex SELECT statement is determined by the SQL optimizer, regardless of the order in which the component clauses are written.
Cross Products
Without the join condition, the result is a cross product of the columns in the tables in question, containing all possible combinations of the selected columns, for example:
SELECT product, ean_code, price FROM items CROSS JOIN products;
or
SELECT product, ean_code, price FROM items, products;
Returns:
PRODUCT |
|
|
'Murder In The Cathedral' |
|
|
'Murder In The Cathedral' |
|
|
'Murder In The Cathedral' |
|
|
'Murder In The Cathedral' |
|
|
... |
|
|
It is easy to see that a carelessly formulated join query can produce a very large result table. Two tables of 100 rows each, for instance, give a cross product with 10,000 rows; three tables of 100 rows each give a cross product with 1,000,000 rows!
The risk of generating large (erroneous) result tables is particularly high in interactive SQL (e.g. when using Mimer BSQL), where queries are so easily written and submitted.
Simple Joins
In simple joins, all tables used in the join are listed in the FROM clause of the SELECT statement. This is in distinction to nested joins, where the search condition for one SELECT is expressed in terms of another SELECT, see Nested Selects.
Example
SELECT product, ean_code, price FROM items JOIN products ON items.product_id = products.product_id;
SELECT *
The form SELECT * may be used in a join query, but since this selects all columns in the result set, at least one column is often duplicated (a join condition column).
Example
SELECT * FROM items JOIN products ON items.product_id = products.product_id;
Returns:
Columns from ITEMS:
ITEM_ID |
PRODUCT_ID |
FORMAT_ID |
RELEASE_DATE |
STATUS |
PRICE |
STOCK |
REORDER_LEVEL |
EAN_CODE |
PRODUCER_ID |
IMAGE_ID |
|
Columns from PRODUCTS:
PRODUCT |
PRODUCT_ID |
PRODUCT_SEARCH |
Columns in the join query that are uniquely identified by the column name may be specified by name alone. Columns that have the same name in the joined tables must be qualified by their respective table names.
The same query as above, but only three columns are returned:
SELECT product, ean_code, price FROM items JOIN products ON items.product_id = products.product_id;
Nesting Join Clauses
It is possible to nest join-clauses, for example:
List the category in addition to the EAN and price for any items released in December 1996:
SELECT ean_code, price, category FROM items JOIN formats ON items.format_id = formats.format_id JOIN categories ON categories.category_id = formats.category_id WHERE release_date BETWEEN date'1996-12-01' AND date'1996-12-30';
Result:
EAN_CODE |
|
CATEGORY |
9780006498957 |
|
Books |
724385487521 |
|
Music |
731453076723 |
|
Music |
53308951925 |
|
Music |
Complex Search Conditions and Joins
A join query can join any number of tables using complex search conditions to select the relevant information from each table.
List the product for any items with a release date in the future along with the item price in both Swedish and Danish crowns (SEK and DKK respectively):
SELECT product, CAST(price * exchange_rate AS DECIMAL(12,2)) AS cost, currency FROM items JOIN products ON products.product_id = items.product_id CROSS JOIN currencies WHERE release_date > CURRENT_DATE AND currencies.code IN ('SEK', 'DKK') ORDER BY product, currency;
Result:
PRODUCT |
|
CURRENCY |
Greatest Hits |
|
Danish Kronor |
Greatest Hits |
|
Swedish Kronor |
Pieces Of Fish |
|
Danish Kronor |
Pieces Of Fish |
|
Swedish Kronor |
The Future Foretold |
|
Danish Kronor |
The Future Foretold |
|
Swedish Kronor |
The Sql Quiz Book |
|
Danish Kronor |
The Sql Quiz Book |
|
Swedish Kronor |
In formulating a search condition for a join query, it can help to write out the columns that would appear in a complete cross-product of the tables. The search condition is then formulated as though the query was a simple SELECT from the cross-product table.
Outer Joins
The joins in the previous sections were all inner joins. In an inner join between two tables, only rows that fulfill the join condition are present in the result.
An outer join, on the contrary, contains non-matching rows as well. The outer join has two options, LEFT and RIGHT.
Left Outer Join
Example
SELECT ean_code, release_date, producer FROM items LEFT OUTER JOIN producers ON items.producer_id = producers.producer_id WHERE ean_code >= 800000000000 ORDER BY ean_code;
Result:
|
RELEASE_DATE |
PRODUCER |
|
1998-08-11 |
Giants Of Jazz (Ita) |
|
2000-10-31 |
Warp Records |
|
1999-09-28 |
- |
|
1998-06-30 |
- |
|
2000-03-14 |
Mint / Cherry Red |
|
1999-10-12 |
Cherry Red |
|
1994-12-27 |
Receiver Records |
|
1994-12-15 |
Receiver Records |
|
1996-01-18 |
Status Records |
|
... |
... |
In the example above all rows from the table to the left in the join clause, i.e. ITEMS, are present in the result; non-matching rows from the PRODUCERS table are filled with null values in the result.
Observe the difference in result for the next statement and the previous one.
SELECT ean_code, release_date, producer FROM items LEFT OUTER JOIN producers ON items.producer_id = producers.producer_id AND ean_code >= 800000000000 ORDER BY ean_code;
Result:
|
RELEASE_DATE |
PRODUCER |
|
1991-08-27 |
- |
|
1991-10-22 |
- |
|
1992-05-19 |
- |
|
1992-12-22 |
- |
|
... |
... |
|
1998-08-11 |
Giants Of Jazz (Ita) |
|
2000-10-31 |
Warp Records |
|
1999-09-28 |
- |
|
1998-06-30 |
- |
|
2000-03-14 |
Mint / Cherry Red |
|
... |
... |
The reason is that conditions in the where clause are applied to the result of the join-clause and not to the joined tables as is the case with the conditions in the on-clause.
Right Outer Join
A right outer join will take all records from the table to the right in the join-clause.
Nesting Outer Joins
As with inner joins, it is possible to nest join-clauses. Nested joins can be of different types, i.e. both inner and outer joins.
The result of nested outer joins can be somewhat unexpected though, as it is the result of the first join-clause that is the left table in the next join, not the right table in the first join-clause.
Example
SELECT * FROM tableA LEFT JOIN tableB ON tableA.id = tableB.id LEFT JOIN tableC ON tableA.id = tableC.id
This query does first perform tableA LEFT JOIN tableB. The result is then used as left table when performing LEFT JOIN tableC.
To make this query clearer, parentheses can be added as:
SELECT * FROM (tableA LEFT JOIN tableB ON tableA.id = tableB.id) LEFT JOIN tableC ON tableA.id = tableC.id
Nested Selects
A form of SELECT, called a subselect, can be used in the search condition of a SELECT statement to form a nested query.
The main SELECT statement is then referred to as the outer select.
For example:
Select the products that have a release date in the future.
SELECT product FROM products WHERE product_id IN (SELECT product_id FROM items WHERE release_date > CURRENT_DATE);
Result:
PRODUCT |
Greatest Hits |
Pieces Of Fish |
The Future Foretold |
The Sql Quiz Book |
To see how this works, evaluate the subselect first:
SELECT product_id FROM items WHERE release_date > CURRENT_DATE;
Result:
PRODUCT_ID |
30206 |
30618 |
31082 |
31083 |
Then use the result of the subselect in the search condition of the outer select:
SELECT product FROM products WHERE product_id IN (30206, 30618, 31082, 31083);
Result:
PRODUCT |
Greatest Hits |
Pieces Of Fish |
The Future Foretold |
The Sql Quiz Book |
Using Subselects
A subselect can be used in a search condition wherever the result of the subselect can provide the correct form of the data for the search condition.
Thus a subselect used with `=' must give a single value as a result.
A subselect used with IN, ALL or ANY must give a set of single values, see Retrieval with ALL, ANY, SOME.
A subselect used with EXISTS may give any result, see Retrieving Data Using EXISTS and NOT EXISTS.
Examples:
WHERE column = (subselect) WHERE column IN (subselect) WHERE column = ALL (subselect) WHERE column = ANY (subselect) WHERE EXISTS (subselect)
The UNION, EXCEPT and INTERSECT operators can be used to combine two or more subselects in more complex statements, see Union, Except and Intersect Queries.
Nested Queries
Many nested queries can equally well be written as simple joins. For example:
Select the products that have a release date in the future.
SELECT product FROM products WHERE product_id IN (SELECT product_id FROM items WHERE release_date > CURRENT_DATE);
or alternatively
SELECT DISTINCT product FROM products JOIN items ON products.product_id = items.product_id WHERE items.release_date > CURRENT_DATE;
Both these queries give exactly the same result. In most cases, the choice of which form to use is a matter of personal preference. Choose the form which you can understand most easily; the clearest formulation is least likely to cause problems.
Subselects in Queries
Queries may contain any number of subselects, for example:
List the producers (manufacturers) which have items that are more expensive than any of the items produced by Sony.
SELECT producer FROM producers WHERE producer_id IN (SELECT producer_id FROM items WHERE price > (SELECT MAX(price) FROM items WHERE producer_id = (SELECT producer_id FROM producers WHERE producer = 'SONY')));
Note the balanced parentheses for the nested levels.
It is particularly important at this level of complication to think carefully through the query to make sure that it is correctly formulated.
Often, writing some of the levels as simple joins can simplify the structure. The previous example may also be written:
SELECT DISTINCT producer FROM producers JOIN items ON producers.producer_id = items.producer_id WHERE price > (SELECT MAX(price) FROM items JOIN producers ON items.producer_id = producers.producer_id WHERE producer = 'SONY');
Correlation Names
A correlation name is a temporary name given to a table to represent a logical copy of the table within a query.
There are three uses for correlation names:
- simplifying complex queries
- joining a table to itself
- outer references in subselects
Simplifying Complex Queries Using Correlation Names
Using short correlation names into complicated queries can make the query easier to write and understand, particularly when qualified table names are used:
SELECT mimer_store_music.artists.artist, mimer_store.product_details.* FROM mimer_store.product_details JOIN mimer_store_music.titles ON mimer_store.product_details.item_id = mimer_store_music.titles.item_id JOIN mimer_store_music.artists ON mimer_store_music.artists.artist_id = mimer_store_music.titles.artist_id ORDER BY mimer_store_music.artists.artist;
may be rewritten
SELECT art.artist, pdt.* FROM mimer_store.product_details AS pdt JOIN mimer_store_music.titles AS ttl ON pdt.item_id = ttl.item_id JOIN mimer_store_music.artists AS art ON art.artist_id = ttl.artist_id ORDER BY art.artist;
The keyword AS in the FROM clause may be omitted, but is recommended for clarity.
About Correlation Names
Correlation names are local to the query in which they are defined.
When a correlation name is introduced for a table name, all references to the table in the same query must use the correlation name.
The following expression is not accepted:
... FROM mimer_store.product_details AS pdt, mimer_store_music.titles AS ttl, ... WHERE ttl.item_id = mimer_store.product_details.item_id
Joining a Table with Itself Using a Correlation Name
Joining a table with itself allows you to compare information in a table with other information in the same table. This can be done with a correlation name.
Select all currencies with the same exchange rate:
SELECT c.currency, c.code, c.exchange_rate FROM currencies AS c JOIN currencies AS copy ON c.exchange_rate = copy.exchange_rate AND c.currency <> copy.currency;
Result:
CURRENCY |
CODE |
|
Croatian Kuna |
HRK |
|
Gourdes |
HTQ |
|
Iraqi Dina |
IQD |
|
Uganda Shillings |
UGX |
|
Here, the table CURRENCIES is joined to a logical copy of itself called COPY.
The first search condition finds pairs of currencies with the same exchange rate, and the second eliminates 'pairs' with the same currency name. Without the second condition in the search criteria, all currencies would be selected!
Without correlation names, this kind of query cannot be formulated. The following query would select all the currencies from the table:
SELECT currency, code, exchange_rate FROM currencies WHERE currencies.exchange_rate = currencies.exchange_rate;
Outer References in Subselects Using Correlation Names
In some constructions using subselects, a subselect at a lower level may refer to a value in a table addressed at a higher level. This kind of reference is called an outer reference.
SELECT currency FROM currencies WHERE EXISTS (SELECT * FROM countries WHERE currency_code = currencies.code);
This kind of query processes the subselect for every row in the outer select, and the outer reference represents the value in the current outer select row. In descriptive terms, the query says 'For each row in CURRENCIES, select the CURRENCY column if there are rows in COUNTRIEScontaining the current CODE value'.
If the qualifying name in an outer reference is not unambiguous in the context of the subselect, a correlation name must be defined in the outer select.
A correlation name may always be used for clarity, as in the following example:
SELECT currency FROM currencies AS c WHERE EXISTS (SELECT * FROM countries WHERE currency_code = c.code);
Retrieving Data Using EXISTS and NOT EXISTS
EXISTS is used to check for the existence of some row or rows which satisfy a specified condition. EXISTS differs from the other operators in that it does not compare specific values; instead, it tests whether a set of values is empty or not. The set of values is specified as a subselect.
The subselect following the EXISTS clause most often uses of `SELECT *' as opposed to `SELECT column-list' since EXISTS only searches to see if the set of values addressed by the subselect is empty or not - a specified column is seldom relevant in the subquery.
EXISTS (subselect) is true if the result set of the subselect is not empty
NOT EXISTS (subselect) is true if the result set of the subselect is empty
SELECT statements with EXISTS almost always include an outer reference linking the subselect to the outer select.
Examples of EXISTS
Find all currencies that are used in the COUNTRIES table:
SELECT currency FROM currencies AS c WHERE EXISTS (SELECT * FROM countries WHERE currency_code = c.code);
Without the outer reference, the select becomes a conditional `all-or-nothing' statement: perform the outer select if the subselect result is not empty, otherwise select nothing.
List all products where the producer (manufacturer) is not known:
SELECT product FROM products AS p WHERE EXISTS (SELECT * FROM items WHERE producer_id IS NULL AND product_id = p.product_id);
Examples of NOT EXISTS
The next example illustrates NOT EXISTS:
List all products where the producer (manufacturer) is not known:
SELECT product FROM products WHERE NOT EXISTS (SELECT * FROM items JOIN producers ON items.producer_id = producers.producer_id WHERE product_id = products.product_id);
Result:
PRODUCT |
Invictus |
Middle Of Nowhere |
Negated EXISTS
Negated EXISTS clauses must be handled with care. There are two semantic `opposites' to EXISTS, with very different meanings:
WHERE EXISTS (SELECT * FROM artists WHERE artist = 'Enigma')
is true if at least one artist is called Enigma.
WHERE NOT EXISTS (SELECT * FROM artists WHERE artist = 'Enigma')
is true if no artist is called Enigma.
But
WHERE EXISTS (SELECT * FROM artists WHERE artist <> 'Enigma')
is true if at least one artist is not called Enigma.
WHERE NOT EXISTS (SELECT * FROM artists WHERE artist <> 'Enigma')
is true if no artist is not called Enigma, that is if every artist is called Enigma.
Retrieval with ALL, ANY, SOME
Subselects that return a set of values may be used in the quantified predicates ALL, ANY or SOME. Thus
WHERE PRICE < ALL (subselect)
selects rows where the price is less than every value returned by the subselect
WHERE PRICE < ANY (subselect)
selects rows where the price is less than at least one of the values returned by the subselect
Select countries that have an exchange rate of less than one:
SELECT country FROM countries WHERE currency_code <> ALL (SELECT code FROM currencies WHERE exchange_rate >= 1.0);
If the result of the subselect is an empty set, ALL evaluates to true, while ANY or SOME evaluates to false.
An alternative to using ALL, ANY or SOME in a value comparison against a general sub-select, is to use EXISTS or NOT EXISTS to see if values are returned by a sub-select which only selects for specific values. For example:
Select countries where the associated currency code contains the letter 'E' as the middle character in the code:
SELECT country FROM countries WHERE currency_code = ANY (SELECT code FROM currencies WHERE code LIKE '_E_');
is equivalent to:
SELECT country FROM countries AS c WHERE EXISTS (SELECT * FROM currencies WHERE code LIKE '_E_' AND code = c.currency_code);
Union, Except and Intersect Queries
The UNION, EXCEPT and INTERSECT operators combine the results of two select clauses.
UNION first merges the result tables specified by the separate selects and then eliminates duplicate rows from the merged set. (UNION ALL does not eliminate duplicate rows.)
EXCEPT takes the distinct rows from the first select and returns the rows that do not appear in the second select. (EXCEPT ALL does not eliminate duplicate rows.)
INTERSECT takes the results of two selects and returns only rows that appear in both selects, after removing duplicate rows from the final result set. (INTERSECT ALL does not eliminate duplicate rows.)
Columns which are merged by UNION, EXCEPT and INTERSECT must have compatible data types (numerical with numerical, character with character, etc).
Subselects addressing more than one result column are merged column by column in the order of selection. The number of columns addressed in each subselect must be the same.
The column names in the result of a UNION, EXCEPT or INTERSECT are taken from the names in the first subselect. Use labels in the first subselect to assign different column names to the result table.
In UNION, EXCEPT and INTERSECT queries, you may need to add an empty column so that columns not represented in both queries in the statement are retained in the result set. This is done by casting a null value to a matching datatype.
Example
SELECT ean_code, release_date, producer FROM items INNER JOIN producers ON items.producer_id = producers.producer_id UNION ALL SELECT ean_code, release_date, CAST(NULL AS char) FROM items WHERE NOT EXISTS (SELECT * FROM producers WHERE items.producer_id = producers.producer_id)
UNION Examples
Select the different codes for currencies and countries that start with the letter 'D':
SELECT code FROM currencies WHERE code LIKE 'D%' UNION SELECT currency_code FROM countries WHERE country LIKE 'D%';
The result is obtained by merging the results of the two selects and eliminating duplicates:
SELECT code SELECT currency_code FROM currencies FROM currencies WHERE code LIKE 'D%; WHERE country LIKE 'D%';
and the UNION gives the result table:
To retain duplicates in the result table, use UNION ALL in place of UNION, see the Mimer SQL Reference Manual, UNION or UNION ALL, for details.
Merge the codes and names of currencies where the code begins with 'D' with the codes and names of the countries where the country begins with 'D':
SELECT code, currency AS currency_or_country FROM currencies WHERE code LIKE 'D%' UNION SELECT currency_code, country FROM countries WHERE country LIKE 'D%' ORDER BY code;
Result:
|
CURRENCY_OR_COUNTRY |
|
Djibouti |
|
Djibouti Francs |
|
Danish Kronor |
|
Denmark |
|
Dominican Pesos |
|
Dominican Republic |
|
Algerian Dinars |
|
Dominica |
Find the lowest and highest exchange_rates:
Unions can be used to combine information from the same table.
SELECT 'Highest', MAX(exchange_rate) AS rate FROM currencies UNION ALL SELECT 'Lowest', MIN(exchange_rate) FROM currencies ORDER BY rate;
Result:
|
RATE |
Lowest |
0.2644 |
Highest |
1035000.0000 |
EXCEPT Examples
Select the codes from currencies, except those that also are found in countries, starting with the letter 'D':
SELECT code FROM currencies WHERE code LIKE 'D%' EXCEPT SELECT currency_code FROM countries WHERE country LIKE 'D%';
The result is obtained by taking the first result and then remove the rows also found in the second select, and finally eliminating duplicates:
SELECT code SELECT currency_code FROM currencies FROM currencies WHERE code LIKE 'D%; WHERE country LIKE 'D%';
and the EXCEPT gives the result table:
To retain duplicates in the result table, use EXCEPT ALL in place of EXCEPT, see the Mimer SQL Reference Manual, EXCEPT or EXCEPT ALL, for details.
INTERSECT Examples
Select the codes from currencies and countries that exist in both tables, starting with the letter 'D':
SELECT code FROM currencies WHERE code LIKE 'D%' INTERSECT SELECT currency_code FROM countries WHERE country LIKE 'D%';
The result is obtained by taking the first result and then remove the rows also found in the second select, and finally eliminating duplicates:
SELECT code SELECT currency_code FROM currencies FROM currencies WHERE code LIKE 'D%; WHERE country LIKE 'D%';
and the INTERSECT gives the result table:
To retain duplicates in the result table, use INTERSECT ALL in place of INTERSECT, see the Mimer SQL Reference Manual, INTERSECT or INTERSECT ALL, for details.
http://developer.mimer.se/documentation/html_110/Mimer_SQL_Engine_DocSet/Retreiving_Data17.html